Clone group mapping method based on improved vector space model
CHEN Zhuo, ZHANG Liping, WANG Huan, ZHANG Jiujie, WANG Chunhui
Journal of Computer Applications    2016, 36 (7): 2031-2037.   DOI: 10.11772/j.issn.1001-9081.2016.07.2031
To address the scarcity and low efficiency of mapping methods for Type-3 clone code, a mapping method based on an improved Vector Space Model (VSM) was proposed. The improved VSM was introduced into clone code analysis to obtain an effective clone group mapping method for Type-1, Type-2 and Type-3 clones. Firstly, the clone group documents were preprocessed to remove useless words, and features such as the file name and function name were extracted at the same time. Secondly, the word frequency vector space of each clone group was built, and the similarity between clone groups was calculated using cosine similarity. Then the clone group mapping was constructed from the clone group similarity and feature matching, which yielded the final mapping result. Experiments were carried out on five open source software systems. The proposed method achieves a recall of at least 96.1% and a precision of at least 97.1% with low time consumption. The experimental results show that the proposed method is feasible and provides data support for the analysis of software evolution.
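The similarity step of this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it builds plain word frequency vectors and compares them with cosine similarity. The stop-word list, the `threshold` value, and the function names are assumptions made for the example, and the feature matching on file and function names is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stop list: the abstract does not say which words count as "useless".
STOP_WORDS = {"int", "return", "if", "else", "for", "while", "void"}


def term_frequencies(code: str) -> Counter:
    """Count identifier-like words in a code fragment, skipping stop words."""
    words = re.findall(r"[A-Za-z_]\w*", code)
    return Counter(w.lower() for w in words if w.lower() not in STOP_WORDS)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_clone_groups(old_groups: dict, new_groups: dict, threshold: float = 0.8) -> dict:
    """Map each old clone group to its most similar new group, if similar enough.

    Both arguments map a group name to the concatenated code of that group.
    The threshold of 0.8 is an arbitrary value chosen for this sketch.
    """
    new_vectors = {name: term_frequencies(code) for name, code in new_groups.items()}
    mapping = {}
    for old_name, old_code in old_groups.items():
        old_vec = term_frequencies(old_code)
        best_name, best_score = None, threshold
        for new_name, new_vec in new_vectors.items():
            score = cosine_similarity(old_vec, new_vec)
            if score >= best_score:
                best_name, best_score = new_name, score
        if best_name is not None:
            mapping[old_name] = best_name
    return mapping
```

A group with no sufficiently similar counterpart is simply left out of the returned mapping, which a later step could interpret as a clone group that disappeared between versions.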
Evolution pattern recognition and genealogy construction based on clone mapping of versions
ZHANG Jiujie, ZHAI Ye, WANG Chunhui, ZHANG Liping, LIU Dongsheng
Journal of Computer Applications    2016, 36 (7): 2021-2030.   DOI: 10.11772/j.issn.1001-9081.2016.07.2021
To address the complexity of existing methods for building clone genealogies and the pressing need to extend the set of evolution patterns, new clone evolution patterns were proposed, and clone genealogies were built automatically based on the mapping relationships of code clones between versions. First, topics of code clones were extracted using Latent Dirichlet Allocation (LDA) from the clone detection results of each released software version. Second, the mapping relationships of code clones between versions were determined by the similarities of the topics. Third, evolution patterns were attached to code clones according to the mapping relationships, and evolution features were analyzed. Finally, the clone genealogy was built by integrating the mapping relationships and the evolution patterns. Experiments on building clone genealogies were conducted on four open source systems. The experimental results show that the proposed approach is feasible and that the proposed evolution patterns do occur during software evolution. Furthermore, about 90% of code clones in the software systems are stable during evolution, and approximately 67% of clone groups live through fewer than half of the release versions. The experimental conclusions and related analysis provide strong support for future research as well as for the maintenance and management of code clones.
Clone genealogy extraction method based on software code evolution information
CHEN Zhuo, ZHANG Liping, WANG Chunhui
Journal of Computer Applications    2016, 36 (12): 3461-3467.   DOI: 10.11772/j.issn.1001-9081.2016.12.3461
The current classification of clone evolution patterns is unclear, and clone genealogy extraction tools are few and inefficient. To solve these problems, a clone genealogy extraction method was proposed based on code clone mapping relationships and evolution information. Firstly, clone groups and clone fragments were mapped by word frequency vector calculation, code line distance and clone attributes from different stages. Then evolution patterns were attached to clone groups and clone fragments according to the mapping results. Finally, the clone genealogy was constructed by combining the clone mapping relationships and evolution patterns across all versions. Four open source software systems were tested and manually verified in experiments. The experimental results show that the clone genealogy extraction tool, Extract Clone Genealogy (ECG), is valid and efficient. In addition, the extraction results show that about 42% of code clones did not change during evolution and about 3.48% underwent inconsistent changes; such clones may introduce potential bugs and deserve attention. The proposed method provides reference and data support for code clone quality assessment and management.
Clone genealogies extraction based on software evolution over multiple versions
TU Ying, ZHANG Liping, WANG Chunhui, HOU Min, LIU Dongsheng
Journal of Computer Applications    2015, 35 (4): 1169-1173.   DOI: 10.11772/j.issn.1001-9081.2015.04.1169
Since clone detection results cannot fully reflect the features of clones, extracting clone genealogies from multiple versions can be used to uncover the patterns and characteristics exhibited by clones in an evolving system. A clone genealogy extraction method named FCG was proposed. FCG first mapped clones between each pair of adjacent versions and then identified clone evolution patterns. All of the results were combined to obtain clone genealogies. Experiments on 6 open source systems found that the average lifetime of clones in the current version is over 70 percent of the total number of studied versions, and most of them do not change, which indicates that the majority of clones can be well maintained. However, some unstable clones may be defect-prone and need to be modified or refactored. The results show that FCG can efficiently extract clone genealogies, which contributes to a better understanding of clones and provides insights into the targeted management of clones.
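One way to picture how adjacent-version mappings combine into genealogies is the sketch below. It is an assumption-laden illustration rather than FCG itself: the data layout (`groups_per_version`, `mappings`), the function name, and the restriction to one-to-one mappings are all choices made for this example, and evolution pattern identification is omitted.

```python
def build_genealogies(groups_per_version, mappings):
    """Chain adjacent-version clone mappings into lineages.

    groups_per_version: list of sets of clone group ids, one set per version.
    mappings: mappings[i] maps a group id in version i to its id in version i + 1.
    Returns a list of lineages, each a list of (version_index, group_id) pairs.
    Assumes mappings are one-to-one (no splits or merges of clone groups).
    """
    lineages = []
    open_lineages = {}  # group id in the previous version -> its lineage
    for version, groups in enumerate(groups_per_version):
        incoming = mappings[version - 1] if version > 0 else {}
        next_open = {}
        continued = set()
        # Extend every lineage whose group was mapped into this version.
        for old_id, lineage in open_lineages.items():
            new_id = incoming.get(old_id)
            if new_id in groups:
                lineage.append((version, new_id))
                next_open[new_id] = lineage
                continued.add(new_id)
        # Groups without a predecessor start a new lineage.
        for group_id in sorted(groups - continued):
            lineage = [(version, group_id)]
            lineages.append(lineage)
            next_open[group_id] = lineage
        open_lineages = next_open
    return lineages


def lifetime(lineage):
    """Number of versions a clone group lineage lives through."""
    return len(lineage)
```

With lineages in this form, the lifetime statistic reported in the abstract corresponds to comparing `lifetime(lineage)` with the total number of studied versions.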

Clone code detection based on Levenshtein distance of token
ZHANG Jiujie, WANG Chunhui, ZHANG Liping, HOU Min, LIU Dongsheng
Journal of Computer Applications    2015, 35 (12): 3536-3543.   DOI: 10.11772/j.issn.1001-9081.2015.12.3536
To address the shortage of detection tools for Type-3 clone code and the low efficiency of existing ones, an effective Type-3 clone code detection method based on the Levenshtein distance of tokens was proposed. The proposed method can detect Type-1, Type-2 and Type-3 clone code efficiently. Firstly, the source code of a subject system was tokenized into token sequences of a specified code size. Secondly, each fixed-size substring of the token sequences was mapped to a corresponding index. Thirdly, by querying the mapping information, clone pairs were built with the Levenshtein distance algorithm and clone groups were built with the disjoint-set algorithm. Finally, feedback information on the clone code was reported. A prototype tool named FClones was implemented. It was evaluated with a code mutation-based framework and compared with two state-of-the-art tools, SimCad and NiCad. The experimental results show that the recall of FClones is equal to or greater than 95% and its precision is not lower than 98% in detecting all three types of clone code. FClones detects Type-3 clones better than the other tools.
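The pairing and grouping steps can be sketched roughly as below. This is not the FClones implementation: the tokenizer is a simple regular expression, the `max_ratio` threshold is an arbitrary assumption, and the sketch compares all fragment pairs directly instead of using the substring index described in the abstract.

```python
import re


def tokenize(code):
    """Split code into identifiers, numbers, and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def levenshtein(a, b):
    """Edit distance between two sequences (here: lists of tokens)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute x by y
        prev = curr
    return prev[-1]


class DisjointSet:
    """Union-find with path halving, used to merge clone pairs into groups."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def clone_groups(fragments, max_ratio=0.3):
    """Group fragments whose normalized token edit distance is at most max_ratio.

    Returns lists of fragment indices; fragments without a clone are omitted.
    """
    tokens = [tokenize(f) for f in fragments]
    sets = DisjointSet(len(fragments))
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            longest = max(len(tokens[i]), len(tokens[j]))
            if longest and levenshtein(tokens[i], tokens[j]) / longest <= max_ratio:
                sets.union(i, j)
    groups = {}
    for i in range(len(fragments)):
        groups.setdefault(sets.find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]
```

Because the distance is computed over tokens rather than characters, a renamed identifier costs a single substitution regardless of its length, which is what lets near-miss fragments fall under the threshold.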
Predicting inconsistent change probability of code clone based on latent Dirichlet allocation model
YI Lili, ZHANG Liping, WANG Chunhui, TU Ying, LIU Dongsheng
Journal of Computer Applications    2014, 34 (6): 1788-1791.   DOI: 10.11772/j.issn.1001-9081.2014.06.1788
Programmer activities such as copying, pasting and modifying code produce many code clones in software systems. Inconsistent changes to these code clones are a main cause of program errors and increased maintenance costs as software versions evolve. To solve this problem, a new method was proposed. First, the mapping relationships between clone groups were built. Then the topics of the lineal clone clusters were extracted using the Latent Dirichlet Allocation (LDA) model. Finally, the inconsistent change probability of code clones was predicted. A software system with eight versions was tested, and clear discrimination was obtained. The experimental results show that the method can effectively predict the probability of inconsistent change and can be used to evaluate the quality and credibility of software.
